Abstract

A local Random Walk Method (RWM) for potential
problems governed by Laplace's and Poisson's
equations is developed for two- and three-dimensional
problems. The RWM is implemented
and demonstrated in a multiprocessor parallel 
environment on a Beowulf cluster of computers. A 
speed gain of 16 is achieved as the number of 
processors is increased from 1 to 23. 

Introduction 

The Finite Element Method (FEM) is a widely used
numerical method for structural analysis. For large-scale
structural analysis, FEM needs extensive
preprocessing and hence may be costly and time
consuming. The Boundary Element Method (BEM) is
considered to be an alternative to FEM for a certain
class of problems. In the BEM, it is necessary to
model only the boundary, and hence BEM reduces the
dimension of the problem by one. Both FEM and BEM have
their own advantages and disadvantages. Hence
methods to couple FEM and BEM have evolved (reference 3) to
take advantage of both methods. A third class of
methods, denoted as meshless methods, has been
developed as a replacement for FEM and BEM. All
the above-mentioned methods can be grouped as 
global methods. These global methods can provide 
solutions to stress, displacement, and other responses 
for all the points in the structure. The global methods 
also invariably need to form and invert a very large 
system matrix to obtain the complete field solution. 
Recently, local methods such as the Random Walk
Method (RWM) were proposed in reference 7 to
obtain the solution at an arbitrary point, without
having to obtain the complete field solution. These
local methods are based on probabilistic
interpretations of certain partial differential equations.
For these local methods, there is no need to
discretize the domain or the boundary. Also, they are
highly economical if the solution is needed at only
a few selected points in the structure. These methods
are simple to program and inherently parallel. This
feature makes the local methods well suited for
analysis on a cluster of computers, such as a Beowulf
cluster. In the cluster, several computer processors
(CPUs) are networked together such that each
processor uses its own local memory but is able to 
communicate with other processors by sending and 
receiving messages. In the present paper the Random 
Walk Method (RWM) for potential problems 
(governed by Laplace's and Poisson's equations) in
two and three dimensions is adopted from reference
7 and implemented in a Beowulf cluster of 
computers. The efficiency of the RWM is 
demonstrated using the Coral Beowulf cluster at 
ICASE (for details see reference 9), NASA Langley 
Research Center. 

In this paper the technical details of the RWM are 
presented first, followed by the presentation of two 
examples (taken from reference 7) to verify the 
development. Next a brief introduction to the Coral 
Beowulf cluster at ICASE, Langley Research Center 
is provided along with the implementation of the 
RWM in the cluster. Next Laplace’s Equation in a 
multiply connected domain is analyzed in the 
Beowulf cluster to measure the speed-up ratio 
obtained in the multiprocessor cluster compared to
the single-processor implementation. Finally,
conclusions from the study are presented. 

Technical details of the RWM 

The first step in the development of the RWM for
Laplace's and Poisson's equations for potential flow
problems is to define the Ito diffusion processes for
Brownian motion based on the mean value theorem.
The full details of the method can be found in
reference 7, so only a brief summary is presented here.
Brownian motion is a random, zigzag motion of
microscopic particles, characterized by stationary,
independent increments over non-overlapping time
intervals. For example, consider a Brownian motion
process $\{B(t), t \ge 0\}$ and the times
$s < s_2 \le s_3 < t$. The increments $B(t) - B(s_3)$ and
$B(s_2) - B(s)$ are independent. For a time interval
$s < t$, the increment $B(t) - B(s)$ is a Gaussian vector
with mean zero and covariance matrix $I(t - s)$,
where $I$ denotes the identity matrix. The Ito
diffusion process starts at an arbitrary point $x$ in a
domain $\Omega$ at time $t = 0$ and is defined for a function
$g(x)$ in terms of the average rate of change of $g(B(t))$
as

$$E[g(B(t))] - g(x) = \frac{1}{2} E\left[\int_0^t \nabla^2 g(B(s))\,ds\right] \tag{1}$$

where $\nabla^2 = \sum_{i=1}^{d} \partial^2/\partial x_i^2$ is the Laplace operator, $E$ is the
expectation with the starting point $B(0) = x$, and
$d = 2$ for two dimensions and $d = 3$ for three
dimensions.
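
Equation (1) can be checked numerically for a simple test function. The sketch below is an illustration added here, not part of reference 7: it assumes $g(x) = |x|^2$, for which $\nabla^2 g = 2d$ and the right-hand side of Equation (1) reduces to $d\,t$. The function names `ito_lhs` and `ito_rhs` are hypothetical.

```python
import math
import random


def ito_lhs(x, t, n_samples, rng):
    """Monte Carlo estimate of E[g(B(t))] - g(x) for g(x) = |x|^2.

    B(t) is sampled directly as x + sqrt(t) * Z with Z a standard
    Gaussian vector, i.e. covariance matrix I * t as stated in the text.
    """
    g_start = sum(c * c for c in x)
    scale = math.sqrt(t)
    total = 0.0
    for _ in range(n_samples):
        b = [c + scale * rng.gauss(0.0, 1.0) for c in x]
        total += sum(c * c for c in b)
    return total / n_samples - g_start


def ito_rhs(d, t):
    """Right-hand side of Equation (1) for g = |x|^2: (1/2) * (2 d) * t."""
    return 0.5 * 2 * d * t


rng = random.Random(0)            # fixed seed so the check is repeatable
lhs = ito_lhs((0.3, 0.4), 0.5, 20000, rng)
rhs = ito_rhs(2, 0.5)             # two dimensions
print(lhs, rhs)                   # the two values should agree closely
```

The agreement is only statistical; the scatter of the left-hand side shrinks as the number of samples grows.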

The Ito formula in Equation (1) can be generalized
by replacing $t$ with a random time $T$. The random
time $T$ is defined as the time the Brownian motion
leaves the domain $D$ for the first time, starting at
$x \in D$. The averaged Ito formula with $t$ replaced by
$T$ can be written as

$$E[g(B(T))] - g(x) = \frac{1}{2} E\left[\int_0^T \nabla^2 g(B(s))\,ds\right] \tag{2}$$

This is the key equation for solving Laplace's and
Poisson's equations by the RWM.
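
As an illustration of Equation (2), not taken from reference 7, consider $g(x) = |x|^2$ on a unit disk or unit ball centered at the origin. This is only a sanity check under that assumed geometry; it shows how the formula yields the mean exit time $E[T]$:

$$\nabla^2 g = 2d, \qquad E[|B(T)|^2] - |x|^2 = \frac{1}{2}\,E\left[\int_0^T 2d\,ds\right] = d\,E[T]$$

$$|B(T)| = 1 \;\Rightarrow\; E[T] = \frac{1 - |x|^2}{d}$$

For $d = 2$ and $|x| = 0.8$ this gives $E[T] = 0.18$, so a walk with time step $\Delta t$ needs on the order of $0.18/\Delta t$ steps before it leaves such a disk.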

Laplace’s and Poisson’s Equations 

Let $u$ be the solution of Poisson's equation
defined in a domain $\Omega$, bounded by a boundary $\Gamma$,

$$\nabla^2 u(x) + p(x) = 0, \quad x \in \Omega \tag{3}$$

satisfying the Dirichlet boundary conditions

$$u(x) = \xi(x), \quad x \in \Gamma_u \tag{4}$$

where $p$ and $\xi$ are specified functions and $\Gamma_u$ is the
boundary on which the Dirichlet boundary conditions
are specified. The value of the unknown function $u$ can
be written using the Ito formula in Equation (2), with
$g = u$, $u(B(T)) = \xi(B(T))$, and $\nabla^2 u(B(s)) = -p(B(s))$, as

$$u(x) = E[\xi(B(T))] + \frac{1}{2} E\left[\int_0^T p(B(s))\,ds\right] \tag{5}$$

The right-hand side of Equation (5) depends only on
the expectations of the known functions $\xi$ and $p$,
and samples of the Brownian motion $B$ in the time
interval $(0, T)$. The expectations in Equation (5) can
be estimated using Monte Carlo simulation, since
it is generally not possible to evaluate the integrals
analytically. Note that the solution to Laplace's
equation can be obtained by substituting $p = 0$ in
Equation (5).
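
One possible discrete form of Equation (5), assuming a fixed time step $\Delta t$ and a left-endpoint sum for the time integral (both are choices made here for illustration, not details given in the paper), is

$$u(x) \approx \frac{1}{N}\sum_{i=1}^{N}\left[\xi\big(B^{(i)}(T_i)\big) + \frac{1}{2}\sum_{k=0}^{n_i - 1} p\big(B^{(i)}(k\,\Delta t)\big)\,\Delta t\right], \qquad T_i = n_i\,\Delta t$$

where $B^{(i)}$ is the $i$-th sampled path started at $x$ and $n_i$ is the number of steps it takes to leave the domain. With $p = 0$ only the boundary term remains.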

The RWM is most suited for the Dirichlet boundary
conditions given in Equation (4). For mixed boundary
conditions the Brownian motion is reflected at the
Neumann boundary as described in reference 7. In
the present paper only problems with Dirichlet
boundary conditions are considered.

Figure 1. Laplace's equation solved on a multiply
connected domain


Numerical Examples in a single processor: 

Two examples from reference 5 are selected to verify 
the RWM in a single processor implementation. 

These examples are described below: 

Laplace’s Equation on a Multiply Connected 
Domain: 

In the first example, Laplace's equation is solved in a
multiply connected two-dimensional domain as
shown in Figure 1. For the dimensions and boundary
conditions shown in Figure 1, the exact solution for
the potential at any point within the domain is (reference 5)

$$\psi_{exact} = 50\left[1 - \frac{1}{2\ln a}\,\ln\frac{(x-a)^2 + y^2}{(ax-1)^2 + a^2 y^2}\right], \quad \text{with } a = 2 + \sqrt{3} \tag{6}$$

The following Monte Carlo simulation procedure was
used to calculate the potential along a circle of radius
$r$, for example $r = 0.8$ (shown as the dotted line
in Figure 1).

1. Start the Brownian motion of a particle at
any point $P(x_p, y_p)$ inside the domain
where the solution of the potential $\psi$ is to be
determined.

2. Select $(x_i, y_i)$ as equal to $(x_p, y_p)$.

3. Select the time step $\Delta t$ for the Brownian
motion.

4. The position of the particle at the end of the
current step can be determined using the
Brownian motion properties as

$$x_{i+1} = x_i + \sqrt{\Delta t}\,\mathrm{Random}(e), \qquad y_{i+1} = y_i + \sqrt{\Delta t}\,\mathrm{Random}(e) \tag{7}$$

where Random(e) is a random number
generator function that returns a random
number from a set of normally
(Gaussian) distributed random numbers with
mean zero and unit variance. Note that the
incremental steps in the x and y directions
are different, since Random(e) is evaluated
separately for each direction.

5. Repeat step 4 until the particle reaches either
of the two boundaries and exits the domain.
Record the value of the potential $\psi_e$ at the
exit point. (The particle is said to be
absorbed at the boundary.)

6. Go to step 1 for the next sample in the Monte
Carlo simulation with the same starting point
$(x_p, y_p)$.

7. Repeat steps 1-5 for $N$ samples.

8. Calculate the value of the potential at the
point $P(x_p, y_p)$ as

$$\psi_p = \frac{1}{N}\sum_{i=1}^{N} \psi_e \tag{8}$$
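
The eight-step procedure can be sketched in a few lines of Python. This is a minimal illustration, not the authors' program. Because Figure 1 is not reproduced here, the geometry and boundary values are assumptions chosen to be consistent with the exact solution in Equation (6): an outer unit circle held at potential 50 and an inner circle of radius 1/4 centered at (1/4, 0) held at potential 0. The function names are hypothetical.

```python
import math
import random

A = 2.0 + math.sqrt(3.0)  # the constant a in Equation (6)


def exact_potential(x, y):
    """Exact solution of Equation (6) at the point (x, y)."""
    ratio = ((x - A) ** 2 + y ** 2) / ((A * x - 1.0) ** 2 + A ** 2 * y ** 2)
    return 50.0 * (1.0 - math.log(ratio) / (2.0 * math.log(A)))


def exit_potential(x, y):
    """Boundary potential if (x, y) has left the domain, otherwise None.

    Assumed geometry: outer unit circle at 50, inner circle of radius
    0.25 centered at (0.25, 0) at 0.
    """
    if x * x + y * y >= 1.0:
        return 50.0
    if (x - 0.25) ** 2 + y * y <= 0.25 ** 2:
        return 0.0
    return None


def rwm_potential(xp, yp, dt, n_samples, rng):
    """Steps 1-8: average exit potential over n_samples random walks."""
    step = math.sqrt(dt)
    total = 0.0
    for _ in range(n_samples):                 # steps 6-7: repeat for N samples
        x, y = xp, yp                          # steps 1-2: start at P
        psi_e = exit_potential(x, y)
        while psi_e is None:                   # step 5: walk until absorbed
            x += step * rng.gauss(0.0, 1.0)    # step 4, Equation (7)
            y += step * rng.gauss(0.0, 1.0)    # independent draw for y
            psi_e = exit_potential(x, y)
        total += psi_e
    return total / n_samples                   # step 8, Equation (8)


rng = random.Random(1)                         # fixed seed for repeatability
print(rwm_potential(0.0, 0.8, 0.001, 2000, rng), exact_potential(0.0, 0.8))
```

The walk is only checked at the end of each step, so a particle can overshoot the boundary slightly; the sketch assigns it the potential of the boundary it crossed.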


The time step ($\Delta t$) and number of samples $N$ are
the two important parameters in the RWM. Two sets
of parameters, $\Delta t = 0.01$, $N = 100$ and
$\Delta t = 0.0001$, $N = 5000$, were selected to study the
effect of the time step $\Delta t$ and sample size
$N$ on the solution. The potentials are calculated at 21
locations equally spaced around the circle of radius
0.8. Figure 2 shows the comparison of the potential
calculated using the RWM and the exact solution for
the two sets of parameters. The solution accuracy
improves upon decreasing the time step $\Delta t$ and
increasing the number of samples $N$.
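
A rough way to see the effect of $N$, assuming the $N$ exit potentials are independent samples with standard deviation $\sigma$ (a standard Monte Carlo argument, not an analysis from the paper):

$$\text{standard error of } \psi_p \approx \frac{\sigma}{\sqrt{N}}, \qquad \frac{\sigma/\sqrt{100}}{\sigma/\sqrt{5000}} = \sqrt{50} \approx 7.1$$

so increasing $N$ from 100 to 5000 should reduce the statistical scatter by about a factor of seven. The error caused by the finite time step is separate and is reduced only by decreasing $\Delta t$.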

Figure 2. Potential ($\psi$) along the circle of radius $r = 0.8$


Poisson's Problem on an ellipsoid: 

The second example is a Poisson's problem on an
ellipsoid with Dirichlet boundary conditions. The
ellipsoid is described by the equation

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1 \tag{9}$$

The Dirichlet boundary conditions on the surface of
the ellipsoid are applied according to the equation

$$\psi_{exact} = x^2 + y^2 + z^2 \tag{10}$$

The ellipsoid major axes are set to
unity ($a = b = c = 1.0$). The potentials along a circle
$x^2 + y^2 = r^2$ (with $r = 0.8$) on the plane $z = 0$ are
calculated using the eight-step Monte Carlo simulation
procedure described in the previous example.
However, the potential at step 5 is calculated using

$$\psi_e = x_e^2 + y_e^2 + z_e^2 \tag{11}$$

The potentials are calculated at 21 locations along the
circumference of the circle. The potentials obtained in
the analysis are shown and compared with the
reference solution in Figure 3 for $\Delta t = 0.1$ and
$N = 5000$ samples. Good agreement is
shown between the analysis results and the reference
solution.
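
The Poisson case differs from the Laplace case only in that the source term of Equation (5) is accumulated along each walk. The sketch below is an illustration under stated assumptions, not the authors' code: the paper does not give the source term, so $p = -6$ is assumed here because it makes $\psi = x^2 + y^2 + z^2$ satisfy $\nabla^2 \psi + p = 0$; the ellipsoid test uses the standard form with $a = b = c = 1$; and the time integral is approximated by a left-endpoint sum. All function names are hypothetical.

```python
import math
import random


def source_term(x, y, z):
    """Assumed source p = -6, so that psi = x^2 + y^2 + z^2 solves Eq. (3)."""
    return -6.0


def boundary_value(x, y, z):
    """Potential recorded at the exit point, Equation (11)."""
    return x * x + y * y + z * z


def inside(x, y, z, a=1.0, b=1.0, c=1.0):
    """True while the particle is inside the ellipsoid with axes a, b, c."""
    return (x / a) ** 2 + (y / b) ** 2 + (z / c) ** 2 < 1.0


def rwm_poisson(start, dt, n_samples, rng):
    """Estimate u(start) from Equation (5) with a fixed time step dt."""
    step = math.sqrt(dt)
    total = 0.0
    for _ in range(n_samples):
        x, y, z = start
        integral = 0.0                      # running sum for the p-integral
        while inside(x, y, z):
            integral += source_term(x, y, z) * dt
            x += step * rng.gauss(0.0, 1.0)
            y += step * rng.gauss(0.0, 1.0)
            z += step * rng.gauss(0.0, 1.0)
        total += boundary_value(x, y, z) + 0.5 * integral
    return total / n_samples


rng = random.Random(2)                      # fixed seed for repeatability
estimate = rwm_poisson((0.8, 0.0, 0.0), 0.001, 2000, rng)
print(estimate, 0.8 ** 2)                   # estimate versus x^2 + y^2 + z^2
```

Under these assumptions the reference value at a point on the circle $r = 0.8$, $z = 0$ is $0.64$, which the estimate should approach as $N$ grows.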



Figure 3. Potential ($\psi$) along the circle with $x^2 + y^2 = (0.8)^2$ and $z = 0$.

Coral Beowulf Cluster at ICASE in Langley 
Research Center 


Coral is a 96-CPU Beowulf cluster with a dual-CPU
400 MHz Pentium II server as the front end, while
two dual-CPU 500 MHz Pentium III machines act as
file servers. The cluster runs the Linux operating
system. There are 64 computing nodes consisting of
8 Pentium III single processors at 400 MHz, 16
Pentium III dual processors at 500 MHz, 16 Pentium
III dual processors at 800 MHz and 24 Pentium IV
single processors at 1700 MHz. The parallel programs
on Coral are written using the Message Passing
Interface (MPI) standard, which is fully described in
reference 8. Three different MPI implementations are
available on Coral. While accessing the Coral cluster,
the user can request all the computing nodes, can
request a few selected nodes, or can specify the number
of nodes required.




American Institute of Aeronautics and Astronautics 




AIAA-2002-1654 


Laplace's Equation on a Multiply Connected Domain
in Multiprocessor Implementation

The multiply connected domain shown in Figure 1 of
example 1 is used in the multiprocessor environment
to calculate the potentials at 21 stations (locations)
along the circumference of the circle with radius
$r = 0.8$. The number of samples is selected as
12,000 for this case. Two types of multiprocessor
configurations are studied and described below.

Configuration I: Samples shared in processors: In this
configuration, for each station, the 12,000 samples are
shared across the processors. For example, in an
analysis with 4 processors, each processor will handle
3000 samples. The total time taken for the analysis is
calculated by summing the time taken for the 21
stations. In other words, in this configuration the
stations are processed sequentially, while the samples
are run in parallel and shared across the cluster.

Configuration II: Stations shared in processors: In this
configuration, the 21 stations are run in parallel on 21
processors, while in each processor the 12,000 samples
are run sequentially. The total time taken for the
analysis is calculated by summing the time taken in
each of the 21 processors.

Configurations I and II are run on the Coral Beowulf
cluster using 1700 MHz Pentium IV single-processor
nodes. For both configurations, the processor
nodes are divided so that there is one master node and
the remaining processors are slave nodes. The master
node maintains all the communication to the slave
nodes. It also receives the data from the slave nodes and
compiles it for output.
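
The bookkeeping behind the two configurations can be sketched without any message-passing library. The helper functions below are hypothetical and only illustrate one way the work might be divided and recombined; the paper does not describe how samples are split when 12,000 is not divisible by the number of processors, so the even split with remainder is an assumption.

```python
def split_samples(n_samples, n_workers):
    """Configuration I: divide the samples as evenly as possible.

    The first (n_samples % n_workers) workers receive one extra sample.
    """
    base, extra = divmod(n_samples, n_workers)
    return [base + 1 if k < extra else base for k in range(n_workers)]


def assign_stations(n_stations, n_workers):
    """Configuration II: round-robin assignment of station indices."""
    return [list(range(k, n_stations, n_workers)) for k in range(n_workers)]


def combine(partial_sums, counts):
    """Master-side step: overall mean from per-worker sums and counts."""
    return sum(partial_sums) / sum(counts)


print(split_samples(12000, 4))       # four processors, 3000 samples each
print(assign_stations(21, 21)[:3])   # one station per processor
```

In a real parallel run each worker would also need its own independent random number stream, otherwise the walks on different processors would repeat one another.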

For Configuration I, the number of processors is
varied from 1 to 23 including the master node. For
Configuration II, 22 processors are used, one for the
master node and 21 processors for the 21 stations.
The total times taken for the multiprocessor analyses
are shown in Figure 4, where it can be seen that the
total time for Configuration I reduced from
approximately 16 seconds to 1 second as the number
of processors varied from 1 to 23. Configuration I
with 22 processors takes almost the same time as
Configuration II with 22 processors. This implies that
each random walk is highly independent of the
others and little time is wasted in communication
between processors.

In order to measure the speed gain in the
multiprocessor analysis, a speed-up ratio is defined
as

$$\text{Speed-up ratio} = \frac{\text{Time taken in single processor}}{\text{Time taken in multiple processors}} \tag{12}$$

The speed-up ratio is shown in Figure 5 for
Configurations I and II. It can be seen that a speed
gain of 16 is obtained as the number of processors
is increased from 1 to 22.
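
As a worked instance of Equation (12), using the approximate Configuration I timings quoted in the text (about 16 seconds on a single processor and about 1 second at the largest processor count; both values are taken from the prose, not read from Figure 4):

$$\text{Speed-up ratio} \approx \frac{16\ \text{s}}{1\ \text{s}} = 16$$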



Figure 4. Variation of analysis time with the
number of processors



Figure 5. Speed-up ratio in the multiprocessor
analyses

SUMMARY


A random walk method for potential problems
governed by Laplace's and Poisson's equations is
developed for two- and three-dimensional problems.
The method is demonstrated on a Beowulf cluster of
computers. A multiply connected domain problem
governed by Laplace's equation is analyzed with the
number of processors ranging from 1 to 23. Two
types of processor sharing are utilized in the parallel
implementation. Using the multiprocessor parallel
method, a speed gain of 16 is achieved as the number
of processors is increased from 1 to 22.